Automatic Sound Classification Inspired by Auditory Scene Analysis
Authors
Abstract
A sound classification system for the automatic recognition of the acoustic environment in a hearing instrument is discussed. The system distinguishes the four sound classes ‘clean speech’, ‘speech in noise’, ‘noise’, and ‘music’, and is based on auditory features and hidden Markov models. The employed features describe level fluctuations, spectral shape, and harmonicity. Sounds from a large database are used for both training and testing of the system. The achieved recognition rates are very high except for the class ‘noise’. Problems arise in the classification of pop music, reverberated speech, tonal noises, and singing.
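The pipeline the abstract describes — an auditory feature sequence scored by one hidden Markov model per sound class, with the best-scoring class winning — can be sketched as below. The class names come from the paper; the Gaussian emissions, number of states, and feature dimensionality are illustrative assumptions, not the authors' actual model configuration.

```python
import numpy as np

def log_gauss(x, means, vars_):
    # Log density of a diagonal-covariance Gaussian, one value per HMM state.
    return -0.5 * np.sum(np.log(2 * np.pi * vars_) + (x - means) ** 2 / vars_,
                         axis=-1)

def forward_loglik(obs, log_pi, log_A, means, vars_):
    # Forward algorithm in log space: total log-likelihood of the
    # observation sequence under a Gaussian-emission HMM.
    alpha = log_pi + log_gauss(obs[0], means, vars_)
    for x in obs[1:]:
        alpha = log_gauss(x, means, vars_) + \
                np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
    return np.logaddexp.reduce(alpha)

def classify(obs, models):
    # Pick the class whose HMM assigns the highest likelihood to the sequence.
    scores = {name: forward_loglik(obs, *m) for name, m in models.items()}
    return max(scores, key=scores.get)

def make_model(n_states, n_feats, shift):
    # Hypothetical class model: uniform transitions, unit variances,
    # and state means offset by `shift` (a stand-in for trained parameters).
    log_pi = np.log(np.full(n_states, 1.0 / n_states))
    log_A = np.log(np.full((n_states, n_states), 1.0 / n_states))
    means = np.full((n_states, n_feats), shift, dtype=float)
    vars_ = np.ones((n_states, n_feats))
    return log_pi, log_A, means, vars_
```

In a real system the per-class parameters would be trained with Baum–Welch on labelled recordings, and `obs` would hold frame-wise auditory features (level fluctuation, spectral shape, harmonicity) rather than raw samples.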
Similar articles
Sound Classification in Hearing Aids Inspired by Auditory Scene Analysis
A sound classification system for the automatic recognition of the acoustic environment in a hearing aid is discussed. The system distinguishes the four sound classes “clean speech,” “speech in noise,” “noise,” and “music.” A number of features that are inspired by auditory scene analysis are extracted from the sound signal. These features describe amplitude modulations, spectral profile, harmo...
An Auditory Scene Analysis Approach to Monaural Speech Segregation
A human listener has the remarkable ability to segregate an acoustic mixture and attend to a target sound. This perceptual process is called auditory scene analysis (ASA). Moreover, the listener can accomplish much of auditory scene analysis with only one ear. Research in ASA has inspired many studies in computational auditory scene analysis (CASA) for sound segregation. In this chapter we intr...
Auditory Scene Analysis: Computational Models
Listeners have to make sense of a complex acoustic world containing overlapping sound sources that must be organized into individual auditory objects. Computational auditory scene analysis concerns the use of algorithms inspired by human sound perception whose aim is to extract properties of constituent sound sources in a complex mixture. Starting with representations based on models of how soun...
Double-vowel segregation through temporal correlation: a bio-inspired neural network paradigm
A two-layer spiking neural network is used to segregate double vowels. The first layer consists of partially connected spiking neurons of the relaxation oscillator type, while the second layer consists of fully connected relaxation oscillators. A two-dimensional auditory image, generated from the enhanced spectrum of cochlear filter-bank envelopes, is computed. The segregation is based on a channel selection ...
Polyphonic Instrument Recognition Using Spectral Clustering
The identification of the instruments playing in a polyphonic music signal is an important and unsolved problem in Music Information Retrieval. In this paper, we propose a framework for the sound source separation and timbre classification of polyphonic, multi-instrumental music signals. The sound source separation method is inspired by ideas from Computational Auditory Scene Analysis and formu...